Deep learning for optical coherence tomography segmentation

ABSTRACT

Systems and methods are presented for providing a machine learning model for segmenting an optical coherence tomography (OCT) image. A first OCT image is obtained, and then labeled with identified boundaries associated with different tissues in the first OCT image using a graph search algorithm. Portions of the labeled first OCT image are extracted to generate a first plurality of image tiles. A second plurality of image tiles is generated by manipulating at least one image tile from the first plurality of image tiles, such as by rotating and/or flipping the at least one image tile. The machine learning model is trained using the first plurality of image tiles and the second plurality of image tiles. The trained machine learning model is used to perform segmentation in a second OCT image.

BACKGROUND

Field of the Disclosure

The present disclosure relates to image processing, and more specifically, to using machine learning models to perform biomedical image segmentation according to various embodiments of the disclosure.

Description of Related Art

In certain biomedical fields such as ophthalmology, images (e.g., an x-ray image, an optical coherence tomography (OCT) image, etc.) of patients' body parts (e.g., an eye) may be captured and analyzed for determining diagnoses for the patients. When analyzing the images, automated segmentation of elements within the images can transform qualitative images into quantitative measurements, which are helpful for both diagnostics and surgical guidance. However, automated image segmentation can be challenging. For example, due to artifacts that appear on the image, such as speckles, the continuous thin boundaries between different types of tissues in an OCT image may become discontinuous, which makes it challenging to automatically identify the different types of tissues in the OCT image. Furthermore, complicated pathological conditions may also make the image segmentation challenging.

Conventional segmentation algorithms rely on explicit description of the problem as well as detailed steps (e.g., explicit rules provided by designers of the algorithms) to solve the problem. This approach works well for images obtained from normal subjects (patients with no diseases), whose anatomical structures follow rules that can be established from a normative human database. However, for human subjects with different diseases, the anatomical structures can vary substantially from normal conditions, making OCT image segmentation challenging. For example, the boundaries between different types of tissues within an eye of someone who has a pathological condition may not follow the patterns of a normal eye. Therefore, there is a need in the art for providing an effective mechanism for automatically segmenting an OCT image.

SUMMARY

According to some embodiments, a system includes a non-transitory memory and one or more hardware processors configured to read instructions from the non-transitory memory to cause the system to perform operations comprising: obtaining an optical coherence tomography (OCT) image; determining edges in the OCT image based on an edge detection algorithm; generating a plurality of image tiles based on the OCT image; generating a plurality of additional image tiles by manipulating at least one image tile of the plurality of image tiles; and training a machine learning model for predicting edges in OCT images based on the plurality of image tiles and the plurality of additional image tiles.

According to some embodiments, a method includes obtaining a biomedical image; determining boundaries of different tissues in the biomedical image based on an edge detection algorithm; generating a first plurality of image tiles based on the biomedical image; generating a second plurality of image tiles by manipulating at least one image tile of the first plurality of image tiles; and training, by one or more hardware processors, a machine learning model for segmenting biomedical images based on the first plurality of image tiles and the second plurality of image tiles.

According to some embodiments, a non-transitory machine-readable medium has stored thereon machine-readable instructions executable to cause a machine to perform operations including: obtaining an optical coherence tomography (OCT) image; analyzing the OCT image based at least in part on an edge detection algorithm; generating a first plurality of image tiles based on the analyzing the OCT image; generating a second plurality of image tiles by manipulating at least one image tile of the first plurality of image tiles; and training a machine learning model for segmenting OCT images based on the first plurality of image tiles and the second plurality of image tiles.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present technology, its features, and its advantages, reference is made to the following description, taken in conjunction with the accompanying drawings.

FIG. 1 is a diagram of a system for analyzing biomedical images according to some embodiments.

FIG. 2 is a diagram of a training module for training a machine learning model for segmenting biomedical images according to some embodiments.

FIG. 3 illustrates a process of training the machine learning model according to some embodiments.

FIG. 4A illustrates an exemplary method for dividing a biomedical image according to some embodiments.

FIG. 4B illustrates an exemplary method for extracting image tiles from a biomedical image according to some embodiments.

FIG. 5 illustrates generating additional image tiles by manipulating at least one image tile according to some embodiments.

FIGS. 6A and 6B are diagrams of processing systems according to some embodiments.

FIG. 7 is a diagram of a multi-layer neural network according to some embodiments.

In the figures, elements having the same designations have the same or similar functions.

DETAILED DESCRIPTION

This description and the accompanying drawings that illustrate inventive aspects, embodiments, implementations, or modules should not be taken as limiting; the claims define the protected invention. Various mechanical, compositional, structural, electrical, and operational changes may be made without departing from the spirit and scope of this description and the claims. In some instances, well-known circuits, structures, or techniques have not been shown or described in detail in order not to obscure the invention. Like numbers in two or more figures represent the same or similar elements.

In this description, specific details are set forth describing some embodiments consistent with the present disclosure. Numerous specific details are set forth in order to provide a thorough understanding of the embodiments. It will be apparent, however, to one skilled in the art that some embodiments may be practiced without some or all of these specific details. The specific embodiments disclosed herein are meant to be illustrative but not limiting. One skilled in the art may realize other elements that, although not specifically described here, are within the scope and the spirit of this disclosure. In addition, to avoid unnecessary repetition, one or more features shown and described in association with one embodiment may be incorporated into other embodiments unless specifically described otherwise or if the one or more features would make an embodiment non-functional.

The technology described below involves systems and methods to provide a machine learning model for automatically segmenting an image (e.g., a biomedical image such as an x-ray image, an optical coherence tomography (OCT) image, etc.), where the machine learning model is trained using training data artificially generated based on manipulations of existing training images. As discussed above, OCT image segmentation can be challenging due to factors such as artifacts (e.g., speckles) that appear on the image and complicated pathological conditions of the patients. Conventional algorithms, such as a graph search algorithm, have been used (e.g., implemented in computers) for performing OCT image segmentation in the past. However, these algorithms may be effective only for performing segmentation on images of normal patients (e.g., patients with no diseases), whose anatomical structures follow rules (or patterns) that can be established from a normative human database. These algorithms may not be effective for performing segmentation on OCT images that include artifacts and/or OCT images taken of patients who have pathological conditions, as the anatomical structures of these patients with complicated pathological conditions may follow different patterns or follow no patterns at all.

In some embodiments, a machine learning model may be configured to perform segmentation of images (e.g., x-ray images, OCT images, etc.). The machine learning model may be trained using training data, such as images of patients obtained in the past. One advantage of using a machine learning model, such as a convolutional neural network (CNN), over a conventional algorithm to perform biomedical image segmentation is that the machine learning model does not rely on explicit rules regarding how to segment images. Rather, with sufficient training data, the machine learning model can derive the rules by itself and continuously evolve (e.g., modify and/or correct the rules) based on new training data. As such, given a large amount of high-quality training data, the machine learning model can be trained to accurately and effectively segment images. In the past, generating the training data required human operators to manually analyze and label the boundaries of the different tissue types in existing images to determine the ground truth. However, not only is it tedious and error prone to generate training data in this manner, it is also difficult to generate a large amount of training data due to the manual labor involved. As the performance of a machine learning model is largely dependent on the amount and the quality of the training data, the performance of the machine learning model would likely suffer when the training data is generated in this manner.

As such, according to various embodiments of the disclosure, a training system may be provided to automatically generate a large amount of high-quality training data for training a machine learning model configured to perform image segmentation. In some embodiments, the training system may obtain training images. The training images may be existing images taken of patients in the past. In some embodiments, the training system may use a conventional algorithm (e.g., a graph search algorithm) as well as manual or semi-automated annotations to analyze and label the training images (e.g., by identifying boundaries (also referred to as edges) of different types of tissues, such as different layers of an eye, in the training images). As discussed herein, one drawback of using a conventional algorithm to segment images is that the conventional algorithm may not be effective in performing segmentation on images having a substantial amount of artifacts (e.g., speckles) or images of patients with different pathological conditions (e.g., different eye diseases). As such, the training data generated by using the conventional algorithm may be limited to only clean images (e.g., images that do not have substantial amounts of artifacts) and images of normal patients. In order to expand the training data to cover images of patients having various pathologies, the training system of some embodiments may artificially generate additional training data by manipulating the existing training images.

In some embodiments, the training system may obtain image tiles (e.g., patches) from each training image. Different embodiments may use different techniques in obtaining image tiles from a training image. In some embodiments, the training system may divide the training image into multiple tiles. For example, from an image having a size of 160 by 40 pixels, the training system may divide the image into sixty-four (64) equally-sized (10 by 10 pixels) tiles. In some embodiments, the training system may obtain a device attribute (e.g., a memory size of a graphical processing unit) of the device that is configured to generate the training data, and may divide the image based on the device attribute. For example, the training system may determine a tile size that does not exceed the memory size of the graphical processing unit of the device, and may then divide the image into tiles based on the tile size.

In some embodiments, the training system may also perform one or more analyses on the image and may divide the image based on the one or more analyses. For example, the training system may perform a pixel analysis to determine portions of the image that do not include relevant data (e.g., portions of the image that include background or blank data). In this regard, the training system may analyze the pixel value of each pixel in the image to determine portions of the image having contiguous pixels with substantially similar (or identical) pixel values (e.g., spatial frequency within the portions below a threshold). The training system may then eliminate (e.g., remove) the portions of the image before dividing the image into the multiple tiles.

In some embodiments, instead of dividing the image into multiple tiles, the training system may generate the image tiles by extracting different image portions from the image. For example, the training system may generate a virtual window having the determined tile size (e.g., 10 by 10 pixels). The training system may place the virtual window at an initial position (e.g., the top left corner) of the image. The training system may analyze the portion of the image within the virtual window to determine whether the portion of the image passes a relevance threshold. If it is determined that the portion of the image passes the relevance threshold, the training system may extract the portion of the image as an image tile. On the other hand, if it is determined that the portion of the image does not pass the relevance threshold, the training system may ignore the portion of the image. In some embodiments, the training system may determine whether a portion of the image passes the relevance threshold based on one or more factors, such as whether a spatial frequency of the portion of the image exceeds a spatial frequency threshold, whether the portion of the image includes labeled data (e.g., includes a portion of a boundary of different tissues labeled by the graph search algorithm), etc.

After extracting (or ignoring) the portion of the image within the virtual window, the training system may then move the virtual window to another position to cover another portion of the image (e.g., moving the virtual window by a predetermined number of pixels to the right, toward the bottom, etc.). The training system may continue to analyze different portions of the image covered by the virtual window and extract the portions that pass the relevance threshold. Based on the predetermined movement of the virtual window, the different portions of the image covered by the virtual window may or may not overlap with each other, such that the image tiles extracted from the image may be partially overlapping. Each image tile that is extracted from the image can become a distinct piece of training data for training the machine learning model. Independently analyzing different portions of the image and extracting only the portions that are relevant may substantially improve the quality of the training data.

Since the initial training images are images of normal patients (e.g., patients with no diseases), the tiles obtained from the images still would not be able to provide adequate training to the machine learning model for patients who have different types of pathologies. Thus, in some embodiments, the training system may generate additional training data corresponding to patients having various pathologies by manipulating the image tiles. For example, the training system may adjust the orientation of each tile (which effectively changes an orientation of the boundaries of the layers in the image tiles) to simulate images of patients having various pathologies. In some embodiments, the training system may generate additional tiles by rotating each tile by one or more rotations, where each additional tile corresponds to rotating the tile to a predetermined degree of rotation. For example, the training system may be configured to rotate each tile by 90 degrees, 180 degrees, and 270 degrees. Thus, for each original image tile, the training system may generate three additional tiles (e.g., three additional pieces of training data): a first additional tile that corresponds to rotating the original tile by 90 degrees, a second additional tile that corresponds to rotating the original tile by 180 degrees, and a third additional tile that corresponds to rotating the original tile by 270 degrees. Different degrees of rotation may be used and/or different numbers of additional tiles may be generated in other embodiments.

In some embodiments, instead of or in addition to rotating the tiles, the training system may also generate additional tiles by flipping each original tile and each additional tile along an axis (e.g., a horizontal axis, a vertical axis, etc.). For example, by flipping a given tile along its horizontal axis and its vertical axis, the training system may generate two additional tiles based on the given tile. In the example where the training system rotates each original tile by 90 degrees, 180 degrees, and 270 degrees, and then flips the original tile and the rotated tiles along a horizontal axis and a vertical axis, the training system may be able to produce 11 additional tiles based on an original tile. Thus, the training system may increase the amount of training data elevenfold, where the additional training data may cover the instances of patients having various pathologies. Furthermore, the training system may also generate additional image tiles by generating different versions of the same image tile, where each version includes added artifacts (e.g., different amounts of speckles, etc.) in the image tile.

The training system may then train the machine learning model using the generated training data (e.g., the tiles and the additional tiles). In some embodiments, the machine learning model may be implemented as a deep convolutional neural network. When training the machine learning model, each piece of training data (e.g., each tile) is first down-sampled through a set of convolution layers, and then up-sampled through a corresponding set of convolution layers. Through the down-sampling and up-sampling of training data, the machine learning model may be trained to identify boundaries of tissues within an OCT image. After training, the machine learning model may be used to identify boundaries of tissues in new OCT images of patients. In some embodiments, the machine learning model may be periodically re-trained using new training data. For example, when a new OCT image is obtained, the training system may be configured to generate training data using the method described herein, and retrain the machine learning model using the newly generated training data.

FIG. 1 illustrates a system 100 within which the training system as discussed herein may be implemented according to some embodiments. System 100 includes a biometrics analysis platform 102 coupled with one or more eyecare professional (ECP) devices, such as ECP devices 130, 140, and 150, via a network 115. In some examples, network 115 may include one or more switching devices, routers, local area networks (e.g., an Ethernet), wide area networks (e.g., the Internet), and/or the like.

Each of the ECP devices (e.g., the ECP devices 130, 140, and 150) may include a user interface (UI) application and an ECP identifier. For example, the ECP device 130 includes a UI application 132 and an ECP identifier 134. The UI application 132 may be used by a corresponding ECP (e.g., the ECP 170) to interact with the biometrics analysis platform 102. For example, the UI application 132 may be a web browser or a client application (e.g., a mobile application). The eyecare professional (ECP) 170, via the UI application 132, may access a graphical user interface (GUI), such as a webpage generated and/or hosted by the biometrics analysis platform 102. The ECP identifier 134 is an identifier that uniquely identifies the ECP 170 among multiple ECPs serviced by the biometrics analysis platform 102.

The biometrics analysis platform 102 includes a user interface (UI) server 103, a biometrics analysis engine 106, a training module 107, and an image segmentation model 108. The UI server 103, in some embodiments, is configured to provide a user interface (e.g., a graphical user interface (GUI), etc.) on the ECP devices 130, 140, and 150, via which the ECPs such as the ECP 170 may interact with the biometrics analysis platform 102. For example, the UI server 103 of some embodiments may include a web server that hosts a website associated with the biometrics analysis platform 102. The UI server 103 may generate and/or store one or more interactive webpages that may be presented on the ECP devices via the UI application (e.g., the UI application 132). In another example, the UI server 103 may include an application server that interacts with a client application (e.g., the UI application 132) via a protocol (e.g., a REST protocol, etc.).

The image segmentation model 108 may be a machine learning model (e.g., a convolutional neural network, etc.) that is configured to perform segmentation on images (e.g., identify boundaries of different tissues in an image). The training module 107 may be configured to train the image segmentation model 108 by generating training data using techniques disclosed herein. The training module 107 may obtain images of patients (e.g., OCT images of patients' eyes, etc.). The training module 107 may use a conventional algorithm (e.g., a graph search algorithm) to analyze and label the boundaries of different tissues in the images. The training module 107 may then artificially generate additional training data using the techniques disclosed herein. For example, the training module 107 may obtain tiles from the image (e.g., by dividing the image or extracting tiles from the image) and manipulate each tile (e.g., by changing an orientation of the tile, adding artifacts to the tile, etc.) to generate the additional training data. The training module 107 may then train the image segmentation model 108 using the generated training data. After training, the image segmentation model 108 may be used by the biometrics analysis engine 106 for augmenting images (e.g., OCT images).

In some embodiments, an ECP (e.g., the ECP 170) may provide, via the UI application (e.g., the UI application 132) and the user interface provided by the UI server 103, image data (e.g., an OCT image) of an eye of a patient. For example, the ECP 170 may use the diagnostic device 160 to capture the image (e.g., an OCT image) of the eye of the patient. In some embodiments, the ECP device 130 may be coupled to the diagnostic device 160 such that the ECP device 130 may automatically retrieve the image from the diagnostic device and transmit the image to the biometrics analysis platform 102 via the UI server 103.

In some embodiments, upon receiving the image, the biometrics analysis engine 106 may analyze the image and provide a diagnosis and/or other information regarding the patient's eye to the ECP 170 based on the image. For example, the biometrics analysis engine 106 may use the trained image segmentation model 108 to identify boundaries of different tissues (e.g., different corneal layers) in the image. The biometrics analysis engine 106 may then augment the image by highlighting the identified boundaries in the image and present the augmented image on the ECP device 130. The augmented image may assist the ECP 170 in diagnosis and/or surgical guidance for the patient. In some embodiments, the biometrics analysis engine 106 may analyze the augmented image to provide additional recommendations, such as a selection of an intra-ocular lens or a contact lens for a patient based on the image.

FIG. 2 illustrates a training module according to various embodiments of the disclosure. As shown, the training module 107 includes a segmentation module 202 and a tiles generation module 204. The training module 107 may use the segmentation module 202 to analyze and label existing images (e.g., an image 222), for example, by using a graph-search algorithm. The training module 107 may then use the tiles generation module 204 to obtain image tiles from each of the labeled images as training data for training the image segmentation model 108. For example, the tiles generation module 204 may divide the image 222 into image tiles (e.g., image tiles 224a-224d, also referred to as original image tiles 224a-224d). The tiles generation module 204 may then manipulate the original image tiles 224a-224d to generate additional image tiles. In some embodiments, the tiles generation module 204 may rotate each of the original image tiles 224a-224d a number of times by different degrees of rotation to generate additional image tiles. Furthermore, the tiles generation module 204 may also flip each of the original image tiles 224a-224d and each additional image tile along an axis (e.g., a horizontal axis, a vertical axis, etc.) to generate additional image tiles for training the image segmentation model 108.

FIG. 3 illustrates a process 300 for training an image segmentation model configured to perform segmentation on images according to one embodiment of the disclosure. In some embodiments, the process 300 may be performed by the training module 107 and/or the biometrics analysis engine 106. The process 300 begins by obtaining a first optical coherence tomography (OCT) image. For example, the training module 107 may obtain training images, such as existing images taken of patients (e.g., existing OCT images taken of patients' eyes) in the past. In some embodiments, the existing images can be obtained from one or more ECP devices, such as the ECP devices 130, 140, and 150. For example, ECPs (e.g., the ECP 170) may capture OCT images of patients (e.g., using diagnostic devices such as the diagnostic device 160). The ECPs may transmit the OCT images to the biometrics analysis platform 102 for analysis, for example, for performing segmentation on the images.

The process 300 then performs (at step 310) segmentation on the first OCT image using an algorithm and generates (at step 315) image tiles from the first OCT image. For example, the segmentation module 202 of the training module 107 may use a conventional algorithm (e.g., a graph search algorithm) to analyze and label the obtained images (e.g., by identifying boundaries of different types of tissues, such as different layers of an eye, in the training images). FIG. 4A illustrates an exemplary OCT image 402 that may be obtained from the ECP device 130. In this example, the OCT image 402 is an image of a patient's eye, and specifically, of different corneal layers of the eye. For example, the OCT image 402 may show the eye including a layer 422 and a layer 424. As shown, due to artifacts and other issues with the image 402, the boundaries of the layers 422 and 424 may not be very clear and/or may be discontinuous. As such, the segmentation module 202 may use a graph search algorithm to identify boundaries of the different layers. As shown in the image 402, by using the graph search algorithm, the segmentation module 202 may highlight the boundaries of the layers, including boundaries 432 and 434 for the layer 422, and boundaries 436 and 438 for the layer 424.

The training module 107 may then use the labeled images (e.g., the labeled OCT image 402) as training images for training the image segmentation model 108. As shown in FIG. 4A, the layers 422 and 424 exhibit one or more patterns with distinctive features. For example, the layer 422 has a wave pattern that includes multiple peaks and troughs, where each cycle of the wave has distinctive features or characteristics (e.g., amplitudes, thicknesses, etc.). Any portion of the layer 422 may include distinct characteristics for training the image segmentation model 108. Similarly, the layer 424 includes discontinuous patches of elongated elements, where each of these elements may have distinctive features or characteristics for training the image segmentation model 108. As such, in some embodiments, instead of using the image 402 as a whole as a piece of training data, the training module 107 may obtain tiles (or patches) of the image 402 as training data.

Different embodiments may use different techniques in obtaining image tiles from a training image (e.g., the image 402). In some embodiments, the tiles generation module 204 of the training module 107 may divide the training image into multiple tiles. For example, when the image 402 has a size of 160 by 40 pixels, the tiles generation module 204 may divide the image 402 into sixty-four (64) equally-sized (10 by 10 pixels) tiles. As shown in FIG. 4A, the tiles generation module 204 may use virtual lines 412-420 to divide the image 402 up into multiple tiles, such as tiles 442-448. In some embodiments, the tiles generation module 204 may obtain a device attribute (e.g., a memory size of a graphical processing unit) of the device (e.g., a computer server such as the biometrics analysis platform 102, etc.) that is configured to generate the training data, and may divide the image based on the device attribute. For example, the tiles generation module 204 may determine a tile size that does not exceed the memory size of the graphical processing unit of the device (e.g., 8 GB, 16 GB, etc.), and may then divide the image 402 into tiles based on the tile size, such that each tile may have a size not exceeding the memory size of the graphical processing unit.
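
By way of a non-limiting illustration only, the following Python sketch shows one way such a division step might be carried out; the helper names (e.g., max_tile_pixels_for, divide_into_tiles) and the memory budget heuristic are assumptions introduced here for illustration and are not a description of the claimed implementation.

    import numpy as np

    def max_tile_pixels_for(gpu_memory_bytes, bytes_per_pixel=4, budget_fraction=0.25):
        # Rough, assumed upper bound on the tile area that fits the available GPU memory.
        return int(gpu_memory_bytes * budget_fraction / bytes_per_pixel)

    def divide_into_tiles(image, tile_h, tile_w):
        # Split an H x W image into non-overlapping tile_h x tile_w tiles.
        h, w = image.shape[:2]
        return [image[top:top + tile_h, left:left + tile_w]
                for top in range(0, h - tile_h + 1, tile_h)
                for left in range(0, w - tile_w + 1, tile_w)]

    # Example: a 160 by 40 pixel image divided into 10 by 10 tiles yields 64 tiles.
    image = np.zeros((40, 160), dtype=np.float32)  # rows x columns
    tiles = divide_into_tiles(image, 10, 10)
    assert len(tiles) == 64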

In some embodiments, the tiles generation module 204 may also perform one or more analyses on the image 402 and may divide the image based on the one or more analyses. For example, the tiles generation module 204 may perform a pixel analysis to determine portions of the image that do not include relevant data (e.g., portions of the image that include background or blank data). In this regard, the tiles generation module 204 may analyze the pixel value of each pixel in the image to determine portions of the image having contiguous pixels with substantially similar (or identical) pixel values (e.g., spatial frequency within the portions below a threshold). The tiles generation module 204 may then eliminate (e.g., remove) the portions of the image before dividing the image into the multiple tiles. For example, the tiles generation module 204 may determine that a portion 450 of the image 402 does not have relevant data based on low spatial frequency and a lack of labeled data (e.g., a labeled boundary) within the portion 450. Thus, the tiles generation module 204 may remove the portion 450 from the image 402 before dividing the image 402 into tiles.
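
A minimal sketch of such a pixel analysis is given below, assuming that a low standard deviation of pixel values is used as a stand-in for low spatial frequency and that a binary mask of the labeled boundary pixels from the graph search step is available; the threshold value and function name are hypothetical.

    import numpy as np

    def is_relevant(region, label_region, variation_threshold=5.0):
        # A near-constant region (little pixel variation) is treated as background
        # unless it contains labeled boundary pixels.
        has_detail = region.std() > variation_threshold
        has_label = label_region.any()
        return has_detail or has_label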

In some embodiments, instead of dividing the image into multiple tiles, the tiles generation module 204 may generate the image tiles by extracting different image portions from the training image (e.g., the image 402). For example, the tiles generation module 204 may provide a virtual window having the determined tile size (e.g., 10 by 10 pixels) on the image. The tiles generation module 204 may place the virtual window at an initial position (e.g., the top left corner) of the image. The tiles generation module 204 may analyze the portion of the image within the virtual window to determine whether the portion of the image passes a relevance threshold. If it is determined that the portion of the image passes the relevance threshold, the tiles generation module 204 may extract the portion of the image as an image tile. On the other hand, if it is determined that the portion of the image does not pass the relevance threshold, the tiles generation module 204 may ignore the portion of the image. In some embodiments, the tiles generation module 204 may determine whether a portion of the image passes the relevance threshold based on one or more factors, such as whether a spatial frequency of the portion of the image exceeds a spatial frequency threshold, whether the portion of the image includes labeled data (e.g., includes a portion of a boundary of different tissues labeled by the graph search algorithm), etc.

FIG. 4B illustrates a virtual window 462 provided on the image 402, for example, by the tiles generation module 204. The virtual window 462 is provided at an initial position (e.g., the top left corner) that covers a first image portion 472 of the image 402. The tiles generation module 204 may analyze the image portion 472 of the image 402 within the virtual window 462 to determine whether the image portion 472 passes a relevance threshold. For example, the tiles generation module 204 may analyze the pixel values of the image portion 472 to determine whether a spatial frequency exceeds a predetermined threshold. The tiles generation module 204 may also determine whether labeled data (e.g., an identified boundary based on the graph search algorithm) is included within the image portion 472. The tiles generation module 204 may then determine whether the image portion 472 of the image 402 passes the relevance threshold, for example, based on the spatial frequency and/or the existence of labeled data in the image portion 472 of the image 402. If it is determined that the image portion 472 passes the relevance threshold, the tiles generation module 204 may extract the image portion 472 of the image 402 as an image tile. On the other hand, if it is determined that the image portion 472 of the image 402 does not pass the relevance threshold, the tiles generation module 204 may ignore the image portion 472. In this example, since the image portion 472 includes a part of the labeled boundary 432, the tiles generation module 204 may determine that the image portion 472 passes the relevance threshold, and thus extract the image portion 472 from the image 402.

After extracting (or ignoring) the portion of the image within the virtual window, the tiles generation module 204 may then move the virtual window to another position to cover another portion of the image (e.g., moving the virtual window by a predetermined number of pixels to the right, toward the bottom, etc.). For example, as shown in FIG. 4B, the tiles generation module 204 may move, after extracting or ignoring the image portion 472, the virtual window 462 a predetermined number of pixels (e.g., 5 pixels) to the right to cover a second image portion 474 of the image 402. The tiles generation module 204 may continue to analyze different portions of the image covered by the virtual window and extract the portions that pass the relevance threshold. Based on the predetermined movement of the virtual window, the different portions of the image covered by the virtual window may or may not overlap with each other, such that the image tiles extracted from the image may be partially overlapping. In this example, the image portions 472 and 474 partially overlap with each other. Each image tile that is extracted from the image can become a distinct piece of training data for training the image segmentation model 108. Independently analyzing different portions of the image and extracting only the portions that are relevant may substantially improve the quality of the training data.
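
For illustration, the sliding virtual-window extraction described above might look like the following sketch, which reuses the hypothetical is_relevant check from the earlier sketch; the window size of 10 pixels and step of 5 pixels are assumptions that produce partially overlapping tiles.

    def extract_tiles(image, label_mask, window=10, step=5):
        # Slide a window across the image and keep only portions that pass the
        # relevance check, together with their ground-truth label patches.
        h, w = image.shape[:2]
        tiles = []
        for top in range(0, h - window + 1, step):
            for left in range(0, w - window + 1, step):
                region = image[top:top + window, left:left + window]
                labels = label_mask[top:top + window, left:left + window]
                if is_relevant(region, labels):
                    tiles.append((region, labels))
        return tiles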

FIG. 5 illustrates exemplary image tiles 502-512 obtained from the image 402 (either by dividing the image 402 or by extracting the tiles from the image 402 using a virtual window). Each of the image tiles (also referred to as the original image tiles) may be used as a piece of training data for training the image segmentation model 108. However, as discussed herein, one drawback of using a conventional algorithm to segment images is that the conventional algorithm may not be effective in performing segmentation on images having a substantial amount of artifacts (e.g., speckles) or images of patients with different pathologies (e.g., different eye diseases). As such, the training data (e.g., the original image tiles) generated by using the conventional algorithm may be limited (e.g., only images that do not have substantial amounts of artifacts and images of normal patients are labeled). Thus, in some embodiments, the training module 107 may artificially generate additional training data corresponding to patients having various pathologies by manipulating the original image tiles.

Referring back to FIG. 3, the process 300 generates (at step 320) additional training images by changing orientations of the tiles. For example, the training module 107 may manipulate the original image tiles by adjusting the orientation of each original image tile (which effectively changes an orientation of the identified boundaries of layers in the image tiles) to simulate images of patients having various pathologies. In some embodiments, the training module 107 may generate additional tiles by rotating each original image tile by one or more rotations, where each additional tile corresponds to rotating the tile to a predetermined degree of rotation. For example, the training module 107 may manipulate each original image tile by rotating each original image tile by 90 degrees, 180 degrees, and 270 degrees. As shown in FIG. 5, the training module 107 may rotate an original image tile (e.g., the image tile 510) by 90 degrees to generate an additional image tile 520a. The training module 107 may also rotate the image tile 510 by 180 degrees to generate an additional image tile 520b. The training module 107 may also rotate the image tile 510 by 270 degrees to generate an additional image tile 520c. Thus, in this example, for each original image tile, the training module 107 may generate three additional tiles (e.g., three additional pieces of training data) based on rotating the original image tile: a first additional tile that corresponds to rotating the original tile by 90 degrees, a second additional tile that corresponds to rotating the original tile by 180 degrees, and a third additional tile that corresponds to rotating the original tile by 270 degrees. Different degrees of rotation may be used and/or different numbers of additional tiles may be generated in other embodiments. For example, by rotating the original image tiles by additional degrees of rotation, a larger number of additional tiles may be generated.
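
As one possible illustration of this rotation step (an assumption for illustration only, not the claimed implementation), each original tile and its label patch can be rotated by 90, 180, and 270 degrees as follows:

    import numpy as np

    def rotated_variants(tile, labels):
        # Generate three additional (tile, label) pairs at 90, 180, and 270 degrees.
        return [(np.rot90(tile, k), np.rot90(labels, k)) for k in (1, 2, 3)]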

In some embodiments, instead of or in addition to rotating the tiles, the training module 107 may also generate additional tiles by flipping each original tile and each additional tile along an axis (e.g., a horizontal axis, a vertical axis, etc.). For example, by flipping a given tile along its horizontal axis and its vertical axis, the training module 107 may generate two additional tiles based on the given tile. As shown in FIG. 5, the training module 107 may generate an additional image tile 520d by flipping the image tile 510 along a vertical axis 530. The training module 107 may also generate another additional image tile 520e by flipping the image tile 510 along a horizontal axis 525. In some embodiments, the training module 107 may also generate additional image tiles by flipping the image tiles 520a-520c. Thus, in the example where the training module 107 rotates each original tile by 90 degrees, 180 degrees, and 270 degrees, and then flips the original tile and the rotated tiles along a horizontal axis and a vertical axis, the training module 107 may be able to produce 11 additional tiles based on an original tile. As a result, the training module 107 may increase the amount of training data elevenfold, where the additional training data may cover the instances of patients having various pathologies. Furthermore, the training module 107 may generate additional tiles by deriving different versions of the same image tiles (e.g., adding varying amounts of artifacts to the same image tiles).
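
A sketch combining the rotations with horizontal- and vertical-axis flips, following the procedure described above (three rotated tiles, plus flips of the original and the rotated tiles), is shown below; it yields 11 additional tiles per original tile. The function name is hypothetical.

    import numpy as np

    def augmented_variants(tile):
        rotated = [np.rot90(tile, k) for k in range(4)]   # original, 90, 180, 270 degrees
        flipped = [np.flipud(t) for t in rotated]          # flips along a horizontal axis
        flipped += [np.fliplr(t) for t in rotated]         # flips along a vertical axis
        return rotated[1:] + flipped                       # 3 + 8 = 11 additional tiles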

The training module 107 may then train the image segmentation model 108 using the generated training data (e.g., the tiles and the additional tiles). In some embodiments, the image segmentation model 108 may be implemented as a deep convolutional neural network, using techniques described in the literature titled "U-Net: Convolutional Networks for Biomedical Image Segmentation" by Ronneberger et al., which is incorporated by reference herein in its entirety. As described in Ronneberger, when training the image segmentation model 108, each piece of training data (e.g., each image tile) is first down-sampled through a set of convolution layers, and then up-sampled through a corresponding set of convolution layers. Through the down-sampling and up-sampling of training data, the image segmentation model 108 may be trained to identify boundaries of tissues within an OCT image. After training, the image segmentation model 108 may be used to identify boundaries of tissues in new OCT images of patients. In some embodiments, the image segmentation model 108 may be periodically re-trained using new training data. For example, when a new OCT image is obtained, the training module 107 may be configured to generate training data using the method described herein, and retrain the image segmentation model 108 using the newly generated training data.
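
By way of a simplified, non-limiting sketch only (the skip connections and most layers of the U-Net architecture referenced above are omitted, and the layer sizes, loss, and optimizer are assumptions), a down-sampling/up-sampling convolutional network of this kind might be trained on (tile, label) pairs as follows:

    import torch
    import torch.nn as nn

    class TinySegmentationNet(nn.Module):
        def __init__(self, in_channels=1, num_classes=2):
            super().__init__()
            self.down = nn.Sequential(                       # down-sampling path
                nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
            self.up = nn.Sequential(                         # up-sampling path
                nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
                nn.Conv2d(16, num_classes, 3, padding=1))    # per-pixel class scores

        def forward(self, x):
            return self.up(self.down(x))

    def train_model(model, loader, epochs=10):
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            for tiles, labels in loader:     # tiles: N x 1 x H x W, labels: N x H x W
                optimizer.zero_grad()
                loss = loss_fn(model(tiles), labels)
                loss.backward()
                optimizer.step()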

Referring back to FIG. 3, the process 300 receives (at step 330) a second OCT image and uses (at step 335) the trained machine learning model to perform segmentation on the second OCT image. For example, the biometrics analysis engine 106 may receive an image from one of the ECP devices 130, 140, and 150, for example, via the UI server 103. The biometrics analysis engine 106 may use the image segmentation model 108 to identify boundaries of different layers (e.g., different types of tissues) in the image. In some embodiments, the biometrics analysis engine 106 may divide the image into image tiles, where each image tile has the predetermined size (e.g., the size determined for generating image tiles for training the image segmentation model 108). The biometrics analysis engine 106 may provide the image tiles to the image segmentation model 108 one by one to obtain identification of boundaries of different layers (or different types of tissues) in the image tiles.
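
An illustrative inference sketch is shown below, under the assumption that the trained model from the previous sketch and the training tile size are reused: a second OCT image is split into tiles, each tile is classified, and the per-tile predictions are stitched back into a full-size boundary map.

    import numpy as np
    import torch

    def segment_image(model, image, tile=10):
        h, w = image.shape
        boundary_map = np.zeros((h, w), dtype=np.int64)
        model.eval()
        with torch.no_grad():
            for top in range(0, h - tile + 1, tile):
                for left in range(0, w - tile + 1, tile):
                    patch = image[top:top + tile, left:left + tile]
                    x = torch.from_numpy(patch).float()[None, None]  # 1 x 1 x tile x tile
                    prediction = model(x).argmax(dim=1)[0].numpy()   # per-pixel class
                    boundary_map[top:top + tile, left:left + tile] = prediction
        return boundary_map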

In some embodiments, the biometrics analysis engine 106 may augment the image by highlighting the different layers or the boundaries of the different layers in the image, and present the augmented image on the ECP device. In some embodiments, the biometrics analysis engine 106 may perform additional analyses on the image based on the identified layers, and may present a report (e.g., a recommendation of a type of intra-ocular lens or a type of contact lens for a patient, etc.) on the ECP device.
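
One possible way to highlight the identified boundaries for display (an illustrative assumption only; the overlay color and the boundary_map from the previous sketch are hypothetical) is:

    import numpy as np

    def highlight_boundaries(gray_image, boundary_map, color=(255, 0, 0)):
        # Convert the grayscale OCT image to RGB and paint boundary pixels in color.
        rgb = np.stack([gray_image] * 3, axis=-1).astype(np.uint8)
        rgb[boundary_map > 0] = color
        return rgb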

FIGS. 6A and 6B are diagrams of processing systems according to some embodiments. Although two embodiments are shown in FIGS. 6A and 6B, persons of ordinary skill in the art will also readily appreciate that other system embodiments are possible. According to some embodiments, the processing systems of FIGS. 6A and/or 6B are representative of computing systems that may be included in one or more of the biometrics analysis platform 102, the ECP devices 130, 140, and 150, and/or the like.

FIG. 6A illustrates a computing system 600 where the components of system 600 are in electrical communication with each other using a bus 605. System 600 includes a processor 610 and a system bus 605 that couples various system components, including memory in the form of a read only memory (ROM) 620, a random access memory (RAM) 625, and/or the like (e.g., PROM, EPROM, FLASH-EPROM, and/or any other memory chip or cartridge), to processor 610. System 600 may further include a cache 612 of high-speed memory connected directly with, in close proximity to, or integrated as part of processor 610. System 600 may access data stored in ROM 620, RAM 625, and/or one or more storage devices 630 through cache 612 for high-speed access by processor 610. In some examples, cache 612 may provide a performance boost that avoids delays by processor 610 in accessing data from memory 615, ROM 620, RAM 625, and/or the one or more storage devices 630 previously stored in cache 612. In some examples, the one or more storage devices 630 store one or more software modules (e.g., software modules 632, 634, 636, and/or the like). Software modules 632, 634, and/or 636 may control and/or be configured to control processor 610 to perform various actions, such as the process of method 300. And although system 600 is shown with only one processor 610, it is understood that processor 610 may be representative of one or more central processing units (CPUs), multi-core processors, microprocessors, microcontrollers, digital signal processors (DSPs), field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), graphics processing units (GPUs), tensor processing units (TPUs), and/or the like. In some examples, system 600 may be implemented as a stand-alone subsystem and/or as a board added to a computing device or as a virtual machine.

To enable user interaction with system 600, system 600 includes one or more communication interfaces 640 and/or one or more input/output (I/O) devices 645. In some examples, the one or more communication interfaces 640 may include one or more network interfaces, network interface cards, and/or the like to provide communication according to one or more network and/or communication bus standards. In some examples, the one or more communication interfaces 640 may include interfaces for communicating with system 600 via a network, such as network 115. In some examples, the one or more I/O devices 645 may include one or more user interface devices (e.g., keyboards, pointing/selection devices (e.g., mice, touch pads, scroll wheels, track balls, touch screens, and/or the like), audio devices (e.g., microphones and/or speakers), sensors, actuators, display devices, and/or the like).

Each of the one or more storage devices 630 may include non-transitory and non-volatile storage such as that provided by a hard disk, an optical medium, a solid-state drive, and/or the like. In some examples, each of the one or more storage devices 630 may be co-located with system 600 (e.g., a local storage device) and/or remote from system 600 (e.g., a cloud storage device).

FIG. 6B illustrates a computing system 650 based on a chipset architecture that may be used in performing any of the methods (e.g., methods 300 and/or 510) described herein. System 650 may include a processor 655, representative of any number of physically and/or logically distinct resources capable of executing software, firmware, and/or other computations, such as one or more CPUs, multi-core processors, microprocessors, microcontrollers, DSPs, FPGAs, ASICs, GPUs, TPUs, and/or the like. As shown, processor 655 is aided by one or more chipsets 660, which may also include one or more CPUs, multi-core processors, microprocessors, microcontrollers, DSPs, FPGAs, ASICs, GPUs, TPUs, co-processors, coder-decoders (CODECs), and/or the like. As shown, the one or more chipsets 660 interface processor 655 with one or more of one or more I/O devices 665, one or more storage devices 670, memory 675, a bridge 680, and/or one or more communication interfaces 690. In some examples, the one or more I/O devices 665, one or more storage devices 670, memory, and/or one or more communication interfaces 690 may correspond to the similarly named counterparts in FIG. 6A and system 600.

In some examples, bridge 680 may provide an additional interface for providing system 650 with access to one or more user interface (UI) components, such as one or more keyboards, pointing/selection devices (e.g., mice, touch pads, scroll wheels, track balls, touch screens, and/or the like), audio devices (e.g., microphones and/or speakers), display devices, and/or the like. According to some embodiments, systems 600 and/or 650 may provide a graphical user interface (GUI) suitable for aiding a user (e.g., a surgeon and/or other medical personnel) in the performance of the processes of method 300.

Methods according to the above-described embodiments may be implemented as executable instructions that are stored on non-transitory, tangible, machine-readable media. The executable instructions, when run by one or more processors (e.g., processor 610 and/or processor 655), may cause the one or more processors to perform the process of method 300. Some common forms of machine-readable media that may include the process of method 300 are, for example, floppy disk, flexible disk, hard disk, magnetic tape, any other magnetic medium, CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, RAM, PROM, EPROM, FLASH-EPROM, any other memory chip or cartridge, and/or any other medium from which a processor or computer is adapted to read.

Devices implementing methods according to these disclosures may comprise hardware, firmware, and/or software, and may take any of a variety of form factors. Typical examples of such form factors include laptops, smart phones, small form factor personal computers, personal digital assistants, and/or the like. Portions of the functionality described herein also may be embodied in peripherals and/or add-in cards. Such functionality may also be implemented on a circuit board among different chips or different processes executing in a single device, by way of further example.

FIG. 7 is a diagram of a multi-layer neural network 700 according to some embodiments. In some embodiments, neural network 700 may be representative of a neural network used to implement a machine learning model for performing segmentation on images, such as OCT images, as discussed herein. Neural network 700 processes input data 710 using an input layer 720. In some examples, input data 710 may correspond to the input data provided to the one or more models and/or the training data provided to the one or more models during the training process used to train the one or more models. Input layer 720 includes a plurality of neurons that are used to condition input data 710 by scaling, range limiting, and/or the like. Each of the neurons in input layer 720 generates an output that is fed to the inputs of a hidden layer 731. Hidden layer 731 includes a plurality of neurons that process the outputs from input layer 720. In some examples, each of the neurons in hidden layer 731 generates an output that is then propagated through one or more additional hidden layers that end with hidden layer 739. Hidden layer 739 includes a plurality of neurons that process the outputs from the previous hidden layer. The outputs of hidden layer 739 are fed to an output layer 740. Output layer 740 includes one or more neurons that are used to condition the output from hidden layer 739 by scaling, range limiting, and/or the like. It should be understood that the architecture of neural network 700 is representative only and that other architectures are possible, including a neural network with only one hidden layer, a neural network without an input layer and/or output layer, a neural network with recurrent layers, and/or the like.

In some examples, each of input layer 720, hidden layers 731-739, and/or output layer 740 includes one or more neurons. In some examples, each of input layer 720, hidden layers 731-739, and/or output layer 740 may include a same number or a different number of neurons. In some examples, each of the neurons takes a combination (e.g., a weighted sum using a trainable weighting matrix W) of its inputs x, adds an optional trainable bias b, and applies an activation function ƒ to generate an output a as shown in Equation 1. In some examples, the activation function ƒ may be a linear activation function, an activation function with upper and/or lower limits, a log-sigmoid function, a hyperbolic tangent function, a non-linear function such as a rectified linear unit (ReLU) function, and/or the like. In some examples, each of the neurons may have a same or a different activation function.

a = ƒ(Wx + b)  (1)

In some examples, neural network 700 may be trained using supervised learning, where combinations of training data (e.g., biometric data of patients, etc.) include a combination of input data and ground truth (e.g., expected) output data (e.g., lens products selected by ECPs for the patients in the past, etc.). The input data is provided to neural network 700 as input data 710, and output data 750 generated by neural network 700 is compared to the ground truth output data. Differences between the generated output data 750 and the ground truth output data may then be fed back into neural network 700 to make corrections to the various trainable weights and biases. In some examples, the differences may be fed back using a back propagation technique with a stochastic gradient descent algorithm, and/or the like. In some examples, a large set of training data combinations may be presented to neural network 700 multiple times until an overall loss function (e.g., a mean-squared error based on the differences of each training combination) converges to an acceptable level.
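
For illustration only, the following sketch shows Equation 1 and the supervised training loop described above using assumed layer sizes, a mean-squared-error loss, and stochastic gradient descent; it is not a description of any particular embodiment.

    import torch
    import torch.nn as nn

    net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(),   # hidden layer: a = f(Wx + b) with f = ReLU
                        nn.Linear(16, 1))              # output layer
    optimizer = torch.optim.SGD(net.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()

    def training_step(x, y_true):
        optimizer.zero_grad()
        y_pred = net(x)                  # forward pass: Equation 1 applied at each layer
        loss = loss_fn(y_pred, y_true)   # difference between output and ground truth
        loss.backward()                  # back-propagate the differences
        optimizer.step()                 # adjust the trainable weights W and biases b
        return loss.item()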

Although illustrative embodiments have been shown and described, a wide range of modification, change and substitution is contemplated in the foregoing disclosure and, in some instances, some features of the embodiments may be employed without a corresponding use of other features. One of ordinary skill in the art would recognize many variations, alternatives, and modifications. Thus, the scope of the invention should be limited only by the following claims, and it is appropriate that the claims be construed broadly and in a manner consistent with the scope of the embodiments disclosed herein.

What is claimed is:
 1. A system, comprising: a non-transitory memory; and one or more hardware processors coupled with the non-transitory memory and configured to read instructions from the non-transitory memory to cause the system to perform operations comprising: obtaining an optical coherence tomography (OCT) image; determining edges in the OCT image based on an edge detection algorithm or manual annotations; generating a plurality of image tiles based on the OCT image; generating a plurality of additional image tiles by manipulating at least one image tile of the plurality of image tiles; and training a machine learning model for predicting edges in OCT images based on the plurality of image tiles and the plurality of additional image tiles.
 2. The system of claim 1, wherein the manipulating the at least one image tile comprises at least one of rotating the at least one image tile or flipping the at least one image tile along an axis.
 3. The system of claim 2, wherein the plurality of additional image tiles comprises image tiles corresponding to rotating the at least one image tile at 0 degrees, 90 degrees, 180 degrees, and 270 degrees.
 4. The system of claim 2, wherein the plurality of additional image tiles comprises an image tile corresponding to flipping the at least one image tile along at least one of a vertical axis or a horizontal axis.
 5. The system of claim 1, wherein the edge detection algorithm comprises a graph search algorithm.
 6. The system of claim 1, wherein the machine learning model comprises a deep convolutional neural network.
 7. The system of claim 1, wherein the operations further comprise predicting edges in a second OCT image using the trained machine learning model.
 8. A method comprising: obtaining a biomedical image; determining boundaries of different tissues in the biomedical image based on an edge detection algorithm; generating a first plurality of image tiles based on the biomedical image; generating a second plurality of image tiles by manipulating at least one image tile of the first plurality of image tiles; and training, by one or more hardware processors, a machine learning model for segmenting biomedical images based on the first plurality of image tiles and the second plurality of image tiles.
 9. The method of claim 8, wherein the determined boundaries correspond to anterior corneal layers of an eye within the biomedical image.
 10. The method of claim 8, wherein the generating the first plurality of image tiles comprises dividing the biomedical image into image patches.
 11. The method of claim 10, further comprising analyzing one or more characteristics of the biomedical image, wherein the biomedical image is divided into the first plurality of image tiles based on the analyzing.
 12. The method of claim 8, wherein the generating the first plurality of image tiles comprises: analyzing a plurality of different portions of the biomedical image; and selecting, from the plurality of different portions, a subset of portions of the biomedical image that passes a relevance threshold.
 13. The method of claim 12, wherein the analyzing the plurality of different portions comprises determining whether a portion of the biomedical image from the plurality of different portions includes a boundary determined by the edge detection algorithm.
 14. The method of claim 8, wherein at least two image tiles in the second plurality of image tiles are partially overlapping.
 15. A non-transitory machine-readable medium having stored thereon machine-readable instructions executable to cause a machine to perform operations comprising: obtaining an optical coherence tomography (OCT) image; analyzing the OCT image based at least in part on an edge detection algorithm; generating a first plurality of image tiles based on the analyzing the OCT image; generating a second plurality of image tiles by manipulating at least one image tile of the first plurality of image tiles; and training a machine learning model for segmenting OCT images based on the first plurality of image tiles and the second plurality of image tiles.
 16. The non-transitory machine-readable medium of claim 15, wherein the analyzing the OCT image comprises identifying edges in the OCT image.
 17. The non-transitory machine-readable medium of claim 16, wherein the operations further comprise: determining a number of edges identified within each image tile in the first plurality of image tiles; and selecting the at least one image tile from the first plurality of image tiles for generating the second plurality of image tiles based on the number of edges identified within each image tile.
 18. The non-transitory machine-readable medium of claim 15, wherein the manipulating the at least one image tile comprises at least one of rotating the at least one image tile or flipping the at least one image tile along an axis.
 19. The non-transitory machine-readable medium of claim 18, wherein the second plurality of image tiles comprises image tiles corresponding to rotating the at least one image tile at 0 degrees, 90 degrees, 180 degrees, and 270 degrees.
 20. The non-transitory machine-readable medium of claim 18, wherein the second plurality of image tiles comprises an image tile corresponding to flipping the at least one image tile along at least one of a vertical axis or a horizontal axis.